94 research outputs found

    Lexical database enrichment through semi-automated morphological analysis

    Derivational morphology proposes meaningful connections between words and is largely unrepresented in lexical databases. This thesis presents a project to enrich a lexical database with morphological links and to evaluate their contribution to disambiguation. A lexical database with sense distinctions was required. WordNet was chosen because of its free availability and widespread use. Its suitability was assessed through critical evaluation with respect to specifications and criticisms, using a transparent, extensible model. The identification of serious shortcomings suggested a portable enrichment methodology, applicable to alternative resources. Although 40% of the most frequent words are prepositions, they have been largely ignored by computational linguists, so addition of prepositions was also required. The preferred approach to morphological enrichment was to infer relations from phenomena discovered algorithmically. Both existing databases and existing algorithms can capture regular morphological relations, but cannot capture exceptions correctly; neither of them provides any semantic information. Some morphological analysis algorithms are subject to the fallacy that morphological analysis can be performed simply by segmentation. Morphological rules, grounded in observation and etymology, govern associations between and attachment of suffixes and contribute to defining the meaning of morphological relationships. Specifying character substitutions circumvents the segmentation fallacy. Morphological rules are prone to undergeneration, minimised through a variable lexical validity requirement, and overgeneration, minimised by rule reformulation and restricting monosyllabic output. Rules take into account the morphology of ancestor languages through co-occurrences of morphological patterns. Multiple rules applicable to an input suffix need their precedence established.
The resistance of prefixations to segmentation has been addressed by identifying linking vowel exceptions and irregular prefixes. The automatic affix discovery algorithm applies heuristics to identify meaningful affixes and is combined with morphological rules into a hybrid model, fed only with empirical data, collected without supervision. Further algorithms apply the rules optimally to automatically pre-identified suffixes and break words into their component morphemes. To handle exceptions, stoplists were created in response to initial errors and fed back into the model through iterative development, leading to 100% precision, contestable only on lexicographic criteria. Stoplist length is minimised by special treatment of monosyllables and reformulation of rules. 96% of words and phrases are analysed. 218,802 directed derivational links have been encoded in the lexicon rather than the wordnet component of the model because the lexicon provides the optimal clustering of word senses. Both links and analyser are portable to an alternative lexicon. The evaluation uses the extended gloss overlaps disambiguation algorithm. The enriched model outperformed WordNet in terms of recall without loss of precision. Failure of all experiments to outperform disambiguation by frequency reflects on WordNet sense distinctions.
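
The abstract's idea of suffix rules expressed as character substitutions, filtered by a lexical validity check and a stoplist of exceptions, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the rule table, the relation labels, the toy lexicon, the stoplist entry and the longest-suffix-first precedence are hypothetical and are not taken from the thesis.

```python
from typing import Optional, Tuple

# Hypothetical rules: (suffix to remove, suffix to add, relation label).
# Each rule is a character substitution, not a plain segmentation.
RULES = [
    ("iness", "y", "quality of being"),
    ("ness", "", "quality of being"),
    ("ation", "ate", "act of"),
    ("ation", "e", "act of"),
    ("er", "", "agent of"),
]

# Toy lexicon standing in for the lexical validity requirement (assumption).
LEXICON = {"happy", "kind", "create", "explore", "teach", "corn"}

# Toy stoplist of known false derivations (assumption).
STOPLIST = {("corner", "corn")}


def derive(word: str) -> Optional[Tuple[str, str]]:
    """Return (base word, relation) for the first valid rule, else None."""
    # Assumed precedence: try longer source suffixes first.
    for source, target, relation in sorted(RULES, key=lambda r: -len(r[0])):
        if not word.endswith(source):
            continue
        candidate = word[: len(word) - len(source)] + target
        # Keep the link only if the output is a known word
        # and the pair has not been stoplisted as an exception.
        if candidate in LEXICON and (word, candidate) not in STOPLIST:
            return candidate, relation
    return None


print(derive("happiness"))
print(derive("exploration"))
print(derive("corner"))
```

In this sketch "exploration" fails the first "-ation" rule because "explorate" is not in the lexicon, then succeeds with the second, which shows why rule order and the validity check interact.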

    Criticality, factorization and long-range correlations in the anisotropic XY-model

    We study the long-range quantum correlations in the anisotropic XY-model. By first examining the thermodynamic limit we show that employing the quantum discord as a figure of merit allows one to capture the main features of the model at zero temperature. Further, by considering suitably large site separations we find that these correlations obey a simple scaling behavior for finite temperatures, allowing for efficient estimation of the critical point. We also address ground-state factorization of this model by explicitly considering finite size systems, showing its relation to the energy spectrum and explaining the persistence of the phenomenon at finite temperatures. Finally, we compute the fidelity between finite and infinite systems in order to show that remarkably small system sizes can closely approximate the thermodynamic limit. Comment: 8 pages, 8 figures. Close to published version

    Low rank perturbations and the spectral statistics of pseudointegrable billiards

    We present an efficient method to solve Schrödinger's equation for perturbations of low rank. In particular, the method allows one to calculate the level counting function with very little numerical effort. To illustrate the power of the method, we calculate the number variance for two pseudointegrable quantum billiards: the barrier billiard and the right triangle billiard (smallest angle π/5). In this way, we obtain precise estimates for the level compressibility in the semiclassical (high energy) limit. In both cases, our results confirm recent theoretical predictions, based on periodic orbit summation. Comment: 4 pages
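
For readers unfamiliar with the statistic, the number variance of an unfolded spectrum (mean level spacing 1) is the variance of the number of levels in a window of length L around its expected value L, and the level compressibility is the large-L limit of that variance divided by L. The sketch below only illustrates this standard definition. It is not the low-rank method of the paper, and the function name, arguments and toy spectrum are assumptions.

```python
from bisect import bisect_left
from typing import Sequence


def number_variance(levels: Sequence[float], L: float,
                    starts: Sequence[float]) -> float:
    """Average of (n - L)**2 over windows [e, e + L) for e in starts.

    `levels` must be sorted and unfolded to unit mean spacing, so that
    the expected count in a window of length L is L itself.
    """
    total = 0.0
    for e in starts:
        # Number of levels with e <= level < e + L.
        n = bisect_left(levels, e + L) - bisect_left(levels, e)
        total += (n - L) ** 2
    return total / len(starts)


# Toy spectrum (assumption): perfectly rigid, equally spaced levels.
levels = [i + 0.5 for i in range(100)]
starts = [k + 0.25 for k in range(50)]
print(number_variance(levels, 3.0, starts))
```

A rigid, equally spaced spectrum gives zero variance for integer L, whereas uncorrelated (Poisson) levels would give a variance growing like L. Pseudointegrable billiards are expected to fall between these two extremes.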

    Investigation of pathogenic mechanisms in multiple colorectal adenoma patients without germline APC or MYH/MUTYH mutations

    Patients with multiple (5–100) colorectal adenomas (MCRAs) often have no germline mutation in known predisposition genes, but their disease probably has a genetic origin. We collected a set of 25 MCRA patients with no detectable germline mutation in APC, MYH/MUTYH or the mismatch repair genes. Extracolonic tumours were absent in these cases. No vertical transmission of the MCRA phenotype was found. Based on the precedent of MYH-associated polyposis (MAP), we searched for a mutational signature in 241 adenomatous polyps from our MCRA cases. Somatic mutation frequencies and spectra at APC, K-ras and BRAF were, however, similar to those in sporadic colorectal adenomas. Our data suggest that the genetic pathway of tumorigenesis in the MCRA patients' tumours is very similar to the classical pathway in sporadic adenomas. In sharp contrast to MAP tumours, we did not find evidence of a specific mutational signature in any individual patient or in the overall set of MCRA cases. These results suggest that hypermutation of APC does not cause our patients' disease and strongly suggest that MAP is not a paradigm for the remaining MCRA patients. Our MCRA patients' colons showed no evidence of microadenomas, unlike in MAP and familial adenomatous polyposis (FAP). However, nuclear β-catenin expression was significantly greater in MCRA patients' tumours than in sporadic adenomas. We suggest that, at least in some cases, the MCRA phenotype results from germline variation that acts subsequent to tumour initiation, perhaps by causing more rapid or more likely progression from microadenoma to macroadenoma.

    The eClinical Care Pathway Framework: A novel structure for creation of online complex clinical care pathways and its application in the management of sexually transmitted infections.

    Despite considerable international eHealth impetus, there is no guidance on the development of online clinical care pathways. Advances in diagnostics now enable self-testing with home diagnosis, to which comprehensive online clinical care could be linked, facilitating completely self-directed, remote care. We describe a new framework for developing complex online clinical care pathways and its application to clinical management of people with genital chlamydia infection, the commonest sexually transmitted infection (STI) in England. Using the existing evidence-base, guidelines and examples from contemporary clinical practice, we developed the eClinical Care Pathway Framework, a nine-step iterative process. Step 1: define the aims of the online pathway; Step 2: define the functional units; Step 3: draft the clinical consultation; Step 4: expert review; Step 5: cognitive testing; Step 6: user-centred interface testing; Step 7: specification development; Step 8: software testing, usability testing and further comprehension testing; Step 9: piloting. We then applied the Framework to create a chlamydia online clinical care pathway (Online Chlamydia Pathway). Use of the Framework elucidated content and structure of the care pathway and identified the need for significant changes in sequences of care (Traditional: history, diagnosis, information versus Online: diagnosis, information, history) and prescribing safety assessment. The Framework met the needs of complex STI management and enabled development of a multi-faceted, fully-automated consultation. The Framework provides a comprehensive structure on which complex online care pathways such as those needed for STI management, which involve clinical services, public health surveillance functions and third party (sexual partner) management, can be developed to meet national clinical and public health standards.
The Online Chlamydia Pathway's standardised method of collecting data on demographics and sexual behaviour, with potential for interoperability with surveillance systems, could be a powerful tool for public health and clinical management. UKCRC Translational Infection Research (TIR) Initiative supported by the Medical Research Council, eSTI2 Consortium (Grant Number G0901608)

    Clearance of apoptotic cells: implications in health and disease

    Recent advances in defining the molecular signaling pathways that regulate the phagocytosis of apoptotic cells have improved our understanding of this complex and evolutionarily conserved process. Studies in mice and humans suggest that the prompt removal of dying cells is crucial for immune tolerance and tissue homeostasis. Failed or defective clearance has emerged as an important contributing factor to a range of disease processes. This review addresses how specific molecular alterations of engulfment pathways are linked to pathogenic states. A better understanding of the apoptotic cell clearance process in healthy and diseased states could offer new therapeutic strategies.